Search results for "reinforcement learning"

showing 10 items of 95 documents

A Reinforcement Learning Approach for User Preference-aware Energy Sharing Systems

2021

Energy Sharing Systems (ESS) are envisioned to be the future of power systems. In these systems, consumers equipped with renewable energy generation capabilities can participate in an energy market to sell their energy. This paper proposes an ESS that, unlike previous works, takes into account the consumers’ preferences, engagement, and bounded rationality. The problem of maximizing the energy exchange under this user model is formulated and shown to be NP-Hard. To learn the user behavior, two heuristics are proposed: 1) a Reinforcement Learning-based algorithm, which provides a bounded regret and 2) a more computationally efficient heuristic, named BPT- ${K}…

Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Mathematical optimization; Correctness; Computer Networks and Communications; Renewable Energy Sustainability and the Environment; Computer science; Heuristic; User modeling; Regret; Bounded rationality; Reinforcement learning; Coal; Energy exchange; Energy Sharing Systems; Green products; Power generation; Production; Reinforcement Learning; Renewable energy sources; User Preference; Virtual Power Plants; Energy market; Heuristics

researchProduct

Virtual Resource Allocation for Wireless Virtualized Heterogeneous Network with Hybrid Energy Supply

2022

In this work, two novel virtual user association and resource allocation algorithms are introduced for a wireless virtualized heterogeneous network with hybrid energy supply. In the considered system, macro base stations (MBSs) are supplied by grid power, and small base stations (SBSs) have energy harvesting capability in addition to the grid power supply. Multiple infrastructure providers (InPs) own the physical resources, i.e., BSs and radio resources. The Mobile Virtual Network Operators (MVNOs) are able to rent these resources from the InPs and operate the virtualized resources to provide services to different users. In particular, aiming to maximize the overall utility …

Optimization; energy harvesting; reinforcement learning; virtualisointi; Computer science; Distributed computing; resource allocation; syväoppiminen; wireless network virtualization; resursointi; computer.software_genre; Indium phosphide; energian kerääminen; III-V semiconductor materials; Base station; Virtualization; Hybrid power systems; Wireless; Resource management; Electrical and Electronic Engineering; Wireless networks; business.industry; Wireless network; Applied Mathematics; deep learning; Grid; Computer Science Applications; koneoppiminen; Resource allocation; business; ADMM; computer; Heterogeneous network; langattomat verkot

researchProduct

A comparison between a two feedback control loop and a reinforcement learning algorithm for compliant low-cost series elastic actuators

2020

Highly compliant elastic actuators have become increasingly prominent in recent years for a variety of robotic applications. With remarkable shock tolerance, elastic actuators are appropriate for robots operating in unstructured environments. In accordance with this trend, a novel elastic actuator was recently designed by our research group for Serpens, a low-cost, open-source and highly compliant multi-purpose modular snake robot. To control the newly designed elastic actuators of Serpens, a two-feedback-loop position control algorithm was proposed. The inner controller loop is implemented as a model reference adaptive controller (MRAC), while the outer control loop adopts a fuzzy pr…

Series (mathematics); Computer science; business.industry; Feedback control; Robotics; Loop (topology); Computer Science::Robotics; VDP::Teknologi: 500; Control theory; Reinforcement learning; Artificial intelligence; Reinforcement learning algorithm; Actuator; business

researchProduct

Evolution and Learning: Evolving Sensors in a Simple MDP Environment

2003

Natural intelligence and autonomous agents face difficulties when acting in information-dense environments. Assailed by a multitude of stimuli they have to make sense of the inflow of information, filtering and processing what is necessary, but discarding that which is unimportant. This paper aims at investigating the interactions between evolution of the sensorial channel extracting the information from the environment and the simultaneous individual adaptation of agent-control. Our particular goal is to study the influence of learning on the evolution of sensors, with learning duration being the tunable parameter. A genetic algorithm governs the evolution of sensors appropriate for the a…

Learning classifier system; business.industry; Computer science; 05 social sciences; Autonomous agent; Experimental and Cognitive Psychology; Grid; 050105 experimental psychology; Task (project management); 03 medical and health sciences; Behavioral Neuroscience; 0302 clinical medicine; Genetic algorithm; Reinforcement learning; 0501 psychology and cognitive sciences; Artificial intelligence; business; Adaptation (computer science); 030217 neurology & neurosurgery; Communication channel; Adaptive Behavior

researchProduct

Reinforcement learning approach to nonequilibrium quantum thermodynamics

2021

We use a reinforcement learning approach to reduce entropy production in a closed quantum system brought out of equilibrium. Our strategy makes use of an external control Hamiltonian and a policy gradient technique. Our approach bears no dependence on the quantitative tool chosen to characterize the degree of thermodynamic irreversibility induced by the dynamical process being considered, requires little knowledge of the dynamics itself and does not need the tracking of the quantum state of the system during the evolution, thus embodying an experimentally non-demanding approach to the control of non-equilibrium quantum thermodynamics. We successfully apply our methods to the case of single- …

Computer science; FOS: Physical sciences; General Physics and Astronomy; Non-equilibrium thermodynamics; 01 natural sciences; Settore FIS/03 - Fisica Della Materia; 010305 fluids & plasmas; symbols.namesake; Quantum state; SHORTCUTS; 0103 physical sciences; Quantum system; Reinforcement learning; Statistical physics; 010306 general physics; Quantum thermodynamics; Condensed Matter - Statistical Mechanics; ADIABATICITY; Quantum Physics; Statistical Mechanics (cond-mat.stat-mech); Entropy production; ENTROPY; symbols; Quantum Physics (quant-ph); Hamiltonian (quantum mechanics)

researchProduct

Learning competitive pricing strategies by multi-agent reinforcement learning

2003

In electronic marketplaces, automated and dynamic pricing is becoming increasingly popular. Agents that perform this task can improve themselves by learning from past observations, possibly using reinforcement learning techniques. Co-learning of several adaptive agents against each other may lead to unforeseen results and increasingly dynamic behavior of the market. In this article we shed some light on price developments arising from a simple price adaptation strategy. Furthermore, we examine several adaptive pricing strategies and their learning behavior in a co-learning scenario with different levels of competition. Q-learning manages to learn best-reply strategies well, but is e…

Economics and Econometrics; Control and Optimization; Management science; Applied Mathematics; Q-learning; Agent-based computational economics; Task (project management); Competition (economics); Pricing strategies; Risk analysis (engineering); Dynamic pricing; Economics; Reinforcement learning; Adaptation (computer science); Journal of Economic Dynamics and Control

researchProduct

Validation of a Reinforcement Learning Policy for Dosage Optimization of Erythropoietin

2007

This paper deals with the validation of a Reinforcement Learning (RL) policy for dosage optimization of Erythropoietin (EPO). This policy was obtained using data from patients in a haemodialysis program during the year 2005. The goal of this policy was to maintain patients' Haemoglobin (Hb) level between 11.5 g/dl and 12.5 g/dl. Individualized management was needed, as each patient usually responds differently to the treatment. RL provides an attractive and satisfactory solution, showing that a policy based on RL would be much more successful in keeping patients within the desired Hb target than the policy followed by the hospital so far. In this work, t…

business.industry; Management science; Computer science; Machine learning; computer.software_genre; Data set; Work (electrical); Robustness (computer science); Erythropoietin; medicine; Reinforcement learning; Artificial intelligence; business; computer; medicine.drug

researchProduct

An AI Walk from Pharmacokinetics to Marketing

2009

This work reviews real-life practical applications of Artificial Intelligence (AI) methods. We focus on the use of Machine Learning (ML) methods applied to real problems rather than synthetic problems with standard and controlled environments. In particular, we will describe the following problems in the next sections: • Optimization of Erythropoietin (EPO) dosages in anaemic patients with Chronic Renal Failure (CRF). • Optimization of a recommender system for citizen web portal users. • Optimization of a marketing campaign. The choice of these problems is due to their relevance and their heterogeneity. This heterogeneity shows the capabilities and versatility …

Support vector machine; Engineering; ComputingMethodologies_PATTERNRECOGNITION; Adaptive resonance theory; Artificial neural network; business.industry; Multilayer perceptron; Reinforcement learning; Artificial intelligence; business; Cluster analysis; Fuzzy logic; Hierarchical clustering

researchProduct

Weeds sampling for map reconstruction: a Markov random field approach

2012

In the past 15 years, there has been growing interest in the study of the spatial distribution of weeds in crops, mainly because this is a prerequisite for reducing herbicide use. A large variety of statistical methods has been developed for this problem ([5], [7], [10]). However, one common point of all of these methods is that they are based on in situ collection of data about the spatial distribution of weeds. A crucial problem is then to choose where, in the field, data should be collected. Since exhaustive sampling of a field is too costly, a lot of attention has been paid to the development of spatial sampling methods ([12], [4], [6], [9]). Classical spatial stochastic model of weeds cou…

[SDE.BE] Environmental Sciences/Biodiversity and Ecology; Biodiversity and Ecology; [STAT.TH] Statistics [stat]/Statistics Theory [stat.TH]; [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST]; Biodiversité et Ecologie; Statistiques (Mathématiques); Markov decision process; dynamic programming; reinforcement learning; adaptive sampling; Markov random field; batch; sampling cost; field approach; weed

researchProduct

MARL-Ped: A multi-agent reinforcement learning based framework to simulate pedestrian groups

2014

Pedestrian simulation is complex because there are different levels of behavior modeling. At the lowest level, local interactions between agents occur; at the middle level, strategic and tactical behaviors such as overtaking or route choices appear; and at the highest level, path-planning is necessary. Agent-based pedestrian simulators either focus on a specific level (mainly the lowest one) or define strategies such as layered architectures to manage the different behavioral levels independently. In our Multi-Agent Reinforcement-Learning-based Pedestrian simulation framework (MARL-Ped) the situation is addressed as a whole. Each embodied agent uses a model-free Reinforcement L…

Engineering; Focus (computing); business.industry; Pedestrian; computer.software_genre; Embodied agent; Hardware and Architecture; Virtual machine; Modeling and Simulation; Shortest path problem; Path (graph theory); Reinforcement learning; Artificial intelligence; Motion planning; business; computer; Software; Simulation Modelling Practice and Theory

researchProduct